Holistic Evaluation of Language Models
Liang, Percy, Bommasani, Rishi, Lee, Tony, Tsipras, Dimitris, Soylu, Dilara, Yasunaga, Michihiro, Zhang, Yian, Narayanan, Deepak, Wu, Yuhuai, Kumar, Ananya, Newman, Benjamin, Yuan, Binhang, Yan, Bobby, Zhang, Ce, Cosgrove, Christian, Manning, Christopher D., Ré, Christopher, Acosta-Navas, Diana, Hudson, Drew A., Zelikman, Eric, Durmus, Esin, Ladhak, Faisal, Rong, Frieda, Ren, Hongyu, Yao, Huaxiu, Wang, Jue, Santhanam, Keshav, Orr, Laurel, Zheng, Lucia, Yuksekgonul, Mert, Suzgun, Mirac, Kim, Nathan, Guha, Neel, Chatterji, Niladri, Khattab, Omar, Henderson, Peter, Huang, Qian, Chi, Ryan, Xie, Sang Michael, Santurkar, Shibani, Ganguli, Surya, Hashimoto, Tatsunori, Icard, Thomas, Zhang, Tianyi, Chaudhary, Vishrav, Wang, William, Li, Xuechen, Mai, Yifan, Zhang, Yuhui, Koreeda, Yuta
Language models (LMs) are becoming the foundation for almost all major language technologies, but their capabilities, limitations, and risks are not well understood. We present Holistic Evaluation of Language Models (HELM) to improve the transparency of language models. First, we taxonomize the vast space of potential scenarios (i.e. use cases) and metrics (i.e. desiderata) that are of interest for LMs. Then we select a broad subset based on coverage and feasibility, noting what's missing or underrepresented (e.g. question answering for neglected English dialects, metrics for trustworthiness). Second, we adopt a multi-metric approach: We measure 7 metrics (accuracy, calibration, robustness, fairness, bias, toxicity, and efficiency) for each of 16 core scenarios when possible (87.5% of the time). This ensures metrics beyond accuracy don't fall by the wayside, and that trade-offs are clearly exposed. We also perform 7 targeted evaluations, based on 26 targeted scenarios, to analyze specific aspects (e.g. reasoning, disinformation). Third, we conduct a large-scale evaluation of 30 prominent language models (spanning open, limited-access, and closed models) on all 42 scenarios, 21 of which were not previously used in mainstream LM evaluation. Prior to HELM, models on average were evaluated on just 17.9% of the core HELM scenarios, with some prominent models not sharing a single scenario in common. We improve this to 96.0%: now all 30 models have been densely benchmarked on the same core scenarios and metrics under standardized conditions. Our evaluation surfaces 25 top-level findings. For full transparency, we release all raw model prompts and completions publicly for further analysis, as well as a general modular toolkit. We intend for HELM to be a living benchmark for the community, continuously updated with new scenarios, metrics, and models.
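The "dense benchmarking" idea above can be pictured as filling in every cell of a model × scenario grid, with all 7 metrics attached to each cell rather than accuracy alone. The following is a minimal illustrative sketch, not HELM's actual toolkit API: the model and scenario names are invented, and `evaluate` is a stub standing in for running prompts and scoring completions.

```python
# Minimal sketch of dense multi-metric benchmarking in the spirit of HELM:
# every model is scored on every core scenario under the same 7 metrics,
# so trade-offs (e.g. accuracy vs. toxicity) stay visible side by side.
# Names are illustrative and scoring is stubbed; this is not HELM's API.

METRICS = ["accuracy", "calibration", "robustness", "fairness",
           "bias", "toxicity", "efficiency"]

def evaluate(model, scenario, metric):
    # Stub: a real harness would run prompts and score the completions.
    return (hash((model, scenario, metric)) % 100) / 100

def dense_benchmark(models, scenarios):
    """Return results[model][scenario][metric] for every combination."""
    return {m: {s: {metric: evaluate(m, s, metric) for metric in METRICS}
                for s in scenarios}
            for m in models}

results = dense_benchmark(["model-a", "model-b"], ["qa", "summarization"])
# Every model/scenario cell carries all 7 metrics, never accuracy alone.
assert all(set(cell) == set(METRICS)
           for per_model in results.values()
           for cell in per_model.values())
```

The point of the nested structure is that no model/scenario pair can be reported without all seven metrics, which is what makes trade-offs directly comparable across models.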
Aman's AI Journal • Read List
"Foundation models" is a term first popularized by the Stanford Institute for Human-Centered Artificial Intelligence. This paradigm has a host of benefits, including: (i) instead of requiring a large, well-labeled dataset for each specific task, foundation models need a great amount of unlabeled data for pretraining and only a limited set of labeled data to fine-tune them for different downstream tasks, thereby reducing labeled-data requirements dramatically; (ii) since a foundation model can be shared across different downstream tasks, the knowledge transfer it brings saves the resources needed to train task-specific models (training a relatively large model with billions of parameters has roughly the same carbon footprint as running five cars over their lifetimes); and (iii) it democratizes AI research by making it much easier for small businesses to deploy AI in a wider range of mission-critical situations, owing to the reduced data-labeling requirements.
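The pretrain-then-fine-tune recipe described in point (i) can be shown with a toy sketch: a frozen "pretrained" feature extractor plus a small task head trained on only a handful of labeled examples. Everything here is illustrative; the extractor is a crude stand-in for a large network pretrained on unlabeled data, and the perceptron head stands in for real fine-tuning.

```python
# Toy sketch of the foundation-model recipe: reuse a frozen "pretrained"
# feature extractor and train only a tiny task head on few labeled examples.
# The extractor below is a stand-in; in practice it is a large network
# pretrained on vast amounts of unlabeled data.

def pretrained_features(text):
    # Frozen extractor (illustrative): crude character/keyword statistics.
    return [len(text) / 10.0,
            sum(c in "!?" for c in text),
            sum(kw in text.lower() for kw in ("great", "good", "love"))]

def fine_tune(labeled, epochs=20, lr=0.1):
    """Train a small linear head (perceptron) on a few labeled examples."""
    w, b = [0.0, 0.0, 0.0], 0.0
    for _ in range(epochs):
        for text, label in labeled:          # label: +1 or -1
            x = pretrained_features(text)
            pred = 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else -1
            if pred != label:                # perceptron update on mistakes
                w = [wi + lr * label * xi for wi, xi in zip(w, x)]
                b += lr * label
    return w, b

# Only a handful of labeled examples are needed for the downstream task,
# because the heavy lifting lives in the (frozen) pretrained features.
head = fine_tune([("I love this, great work!", 1),
                  ("great good love", 1),
                  ("terrible and slow", -1),
                  ("awful experience", -1)])
```

The design choice mirrors the paragraph above: the expensive component (the extractor) is shared and reused, while each downstream task only pays for a tiny head and a few labels.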
Fellowship Programs
HAI Fellowship Programs offer opportunities to explore topics, conduct research, and collaborate across disciplines related to AI technologies, applications, or impact. The Institute for Human-Centered Artificial Intelligence (HAI) offers a 2-quarter program for Stanford Graduate Students. The goal of this program is to encourage interdisciplinary research conversations, facilitate new collaborations, and grow the HAI community of graduate scholars who are working in the area of AI, broadly defined. HAI is seeking graduate students to participate in this program. We would like to ensure the cohort is well-rounded across disciplines.
g-f(2)227 The Big Picture of Business Artificial Intelligence (4/17/2021), IEEE Spectrum, 15 Graphs You Need to See to Understand AI in 2021.
The massive document, produced by the Stanford Institute for Human-Centered Artificial Intelligence, is packed full of data and graphs, and we've plucked out 15 that provide a snapshot of the current state of AI. AI research is booming: More than 120,000 peer-reviewed AI papers were published in 2019. The money continues to pour in. Global corporate investment in AI soared to nearly $68 billion in 2020, an increase of 40 percent over the year before. Corporations are steadily increasing their adoption of AI tools in such industries as telecom, financial services, and automotive.
AI tool streamlines feedback on coding homework
This past spring, Stanford University computer scientists unveiled their pandemic brainchild, Code In Place, a project in which 1,000 volunteer teachers taught 10,000 students across the globe the content of an introductory Stanford computer science course. While the instructors could share their knowledge with hundreds, even thousands, of students at a time during lectures, when it came to homework, large-scale and high-quality feedback on student assignments seemed like an insurmountable task. "It was a free class anyone in the world could take, and we got a whole bunch of humans to help us teach it," said Chris Piech, assistant professor of computer science and co-creator of Code In Place. "But the one thing we couldn't really do is scale the feedback." To solve this problem, Piech worked with Chelsea Finn, assistant professor of computer science and of electrical engineering, and PhD students Mike Wu and Alan Cheng to develop and test a first-of-its-kind artificial intelligence teaching tool capable of assisting educators in grading and providing meaningful, constructive feedback on a high volume of student assignments. Their innovative tool, which is detailed in a Stanford AI Lab blogpost, exceeded their expectations. In education, it can be difficult to get lots of data for a single problem, like hundreds of instructor comments on one homework question. Companies that market online coding courses are often similarly limited, and therefore rely on multiple-choice questions or generic error messages when reviewing students' work. "This task is really hard for machine learning because you don't have a ton of data."
AIXPERIMENTATIONLAB -- Human-centered Artificial Intelligence
Artificial intelligence (AI) is considered a groundbreaking technology in many fields due to its potential to reproduce or exceed the cognitive performance of humans across a variety of applications. However, use cases so far have shown that a fully automated use of AI alone is not sufficient for efficient decision-making in real-world scenarios. For this reason, a more inclusive paradigm, commonly known as augmented intelligence, has emerged alongside the adoption of automated systems. It is based on the assumption that human and machine intelligence complement each other positively. As AI-powered technologies and processes increasingly find their way into a range of industries, the relevance of designing and developing augmenting systems for complex tasks has gained importance for their widespread adoption.
AI-controlled sensors could save lives in 'smart' hospitals and homes
As many as 400,000 Americans die each year because of medical errors, but many of these deaths could be prevented by using electronic sensors and artificial intelligence to help medical professionals monitor and treat vulnerable patients in ways that improve outcomes while respecting privacy. "We have the ability to build technologies into the physical spaces where health care is delivered to help cut the rate of fatal errors that occur today due to the sheer volume of patients and the complexity of their care," said Arnold Milstein, a professor of medicine and director of Stanford's Clinical Excellence Research Center (CERC). Milstein, along with computer science professor Fei-Fei Li and graduate student Albert Haque, are co-authors of a Nature paper that reviews the field of "ambient intelligence" in health care -- an interdisciplinary effort to create such smart hospital rooms equipped with AI systems that can do a range of things to improve outcomes. For example, sensors and AI can immediately alert clinicians and patient visitors when they fail to sanitize their hands before entering a hospital room. AI tools can be built into smart homes where technology could unobtrusively monitor the frail elderly for behavioral clues of impending health crises.
The AI Index 2021 Annual Report
Zhang, Daniel, Mishra, Saurabh, Brynjolfsson, Erik, Etchemendy, John, Ganguli, Deep, Grosz, Barbara, Lyons, Terah, Manyika, James, Niebles, Juan Carlos, Sellitto, Michael, Shoham, Yoav, Clark, Jack, Perrault, Raymond
Welcome to the fourth edition of the AI Index Report. This year we significantly expanded the amount of data available in the report, worked with a broader set of external organizations to calibrate our data, and deepened our connections with the Stanford Institute for Human-Centered Artificial Intelligence (HAI). The AI Index Report tracks, collates, distills, and visualizes data related to artificial intelligence. Its mission is to provide unbiased, rigorously vetted, and globally sourced data for policymakers, researchers, executives, journalists, and the general public to develop intuitions about the complex field of AI. The report aims to be the most credible and authoritative source for data and insights about AI in the world.
Latest 2021 Stanford AI Study Gives Deep Look into the State of the Global AI Marketplace
Despite major disruptions from the ongoing COVID-19 pandemic, global investment in AI technologies grew by 40 percent in 2020 to $67.9 billion, up from $48.8 billion in 2019, as AI research and use continues to boom across broad segments of bioscience, healthcare, manufacturing and more. The figures, compiled as part of Stanford University's Artificial Intelligence Index Report 2021 on the state of AI research, development, implementation and use around the world, help illustrate the continually changing scope of the still-maturing technology. The 222-page AI Index 2021 report, touted as the school's fourth annual study of AI impact and progress, was released March 3 by Stanford's Institute for Human-Centered Artificial Intelligence. The report provides a detailed portrait of the AI waterfront last year, including increasing AI investments and use in medicine and healthcare, China's growth in AI research, huge gains in AI capabilities across industries, concerns about diversity among AI researchers, ongoing debates about AI ethics and more. "The impact of AI this past year was both societal and economic, driven by the increasingly rapid progress of the technology itself," AI Index co-chair Jack Clark said in a statement.